Deep-Manager: a versatile tool for optimal feature selection in live-cell imaging analysis

One of the major problems in bioimaging, often highly underestimated, is whether features extracted for a discrimination or regression task will remain valid for a broader set of similar experiments or in the presence of unpredictable perturbations during the image acquisition process. This issue is even more important in the context of deep learning features, because no relationship is known a priori between the black-box descriptors (deep features) and the phenotypic properties of the biological entities under study. In this regard, the widespread use of descriptors, such as those coming from pre-trained Convolutional Neural Networks (CNNs), is hindered by the fact that they are devoid of apparent physical meaning and strongly subject to unspecific biases, i.e., features that do not depend on the cell phenotypes but rather on acquisition artifacts, such as brightness or texture changes, focus shifts, autofluorescence, or photobleaching. The proposed Deep-Manager software platform offers the possibility to efficiently select those features that have lower sensitivity to unspecific disturbances and, at the same time, a high discriminating power. Deep-Manager can be used in the context of both handcrafted and deep features. The unprecedented performance of the method is demonstrated using five different case studies, ranging from the selection of handcrafted green fluorescent protein intensity features in the investigation of chemotherapy-related breast cancer cell death to problems related to Deep Transfer Learning. Deep-Manager, freely available at https://github.com/BEEuniroma2/Deep-Manager, is suitable for use in many fields of bioimaging and is conceived to be constantly upgraded with novel image acquisition perturbations and modalities.

The Deep-Manager platform allows users to perform specific sensitivity tests on their own image datasets to select the most appropriate features for a specific classification task. Sensitivity tests aim to detect which features, whether extracted by ad hoc algorithms (handcrafted) or by a pre-specified deep learning network through a transfer learning approach, are more sensitive to acquisition-specific external quantities and phenomena. Among the vast panorama of acquisition devices and experimental set-ups, to prove the effectiveness of the proposed method we selected three of the most widely used practical contexts in biological image analysis: 2D transmission light time-lapse microscopy, 3D phase-contrast time-lapse microscopy, and 3D fluorescence time-lapse microscopy. The implemented sensitivity tests are therefore designed for those contexts. However, the list of tests available in the Deep-Manager platform could be extended in the future to other fields, such as histopathological imaging or indirect immunofluorescence. For this reason, in the remainder we refer to the present release as Deep-Manager version 1.0. Link: https://github.com/BEEuniroma2/Deep-Manager

IM-ACQ-1
Test 1) Brightness artifact
The brightness test randomly applies an overall brightness change to each image in the dataset by numerically adding a luminance value, drawn from a user-defined range, to all the pixels in the image. In this way, we simulate a drift in the luminance level of the lamp or sudden changes in the acquisition setup, for example due to the influence of external light sources (opening of the incubator, changes in room settings) 1 .
Denoting by I(x,y) each image entering the brightness artifact experiment and by l0 a generic luminance level in the range [-lmin, lmax], the modified image Ib(x,y) is obtained by

$$I_b(x,y) = \max\bigl(0, \min\bigl(I(x,y) + l_0, 1\bigr)\bigr) \tag{1}$$

where the minimum and maximum operators ensure that the final image remains in the expected range [0,1].
An example of the brightness artifact is shown in Fig. S1 (brightening with l = +0.1 and darkening with l = -0.1).
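As an illustration of Eq. 1, the following minimal sketch applies the brightness perturbation to an image stored as nested lists of values in [0, 1]. The function names, the use of plain lists (chosen to keep the example dependency-free), and the uniform sampling of l0 are assumptions made for this example; they are not taken from the Deep-Manager code.

```python
import random


def apply_brightness(image, l0):
    """Add the luminance offset l0 to every pixel and clip to [0, 1] (Eq. 1)."""
    return [[max(0.0, min(pixel + l0, 1.0)) for pixel in row] for row in image]


def random_brightness(image, l_min, l_max, rng=random):
    """Draw l0 from [-l_min, l_max] (uniform sampling is an assumption) and apply it."""
    l0 = rng.uniform(-l_min, l_max)
    return apply_brightness(image, l0), l0


# Toy 2x2 image with intensities already normalized to [0, 1].
image = [[0.0, 0.5], [0.95, 1.0]]
brightened = apply_brightness(image, +0.1)  # values above 1 are clipped to 1
darkened = apply_brightness(image, -0.1)    # values below 0 are clipped to 0
```

The clipping step is what keeps pixels that are already close to 0 or 1 inside the valid range, at the cost of losing contrast there.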

Test 2) Stage multi-positioning artifact
Stage multi-positioning is a frequent practice in time-lapse microscopy experiments for saving time and conducting parallel experiments. However, the exact repositioning of the stage is not always easy to achieve, and a very small object movement can be observed (in the order of 1 pixel, i.e., less than 1 µm).
To simulate such an effect, the Cartesian domain of the image is randomly rotated by an angle and translated by shift terms along the x-axis and y-axis, each drawn from a user-defined range.
Denoting by x and y the original coordinate system, by lx and ly the shift terms along x and y, respectively, and by θ the rotation angle, θ ∈ [θmin, θmax], the modified coordinate system is obtained by rotating (x, y) by θ and translating the result by (lx, ly), where the centre of rotation is the geometrical centre of the image.
An example of the movement artifact is shown in Fig. S2.
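The coordinate change described for this test can be sketched as a rotation about the image centre followed by a shift. The sign convention of the rotation, the use of degrees, the choice of ((width - 1)/2, (height - 1)/2) as geometrical centre, and the function name are assumptions made for illustration; the platform's exact transformation may differ.

```python
import math


def transform_coordinates(x, y, theta_deg, lx, ly, width, height):
    """Rotate (x, y) by theta about the image centre, then shift by (lx, ly)."""
    # Assumption: the geometrical centre is the midpoint of the pixel grid.
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    theta = math.radians(theta_deg)
    dx, dy = x - cx, y - cy
    x_new = cx + dx * math.cos(theta) - dy * math.sin(theta) + lx
    y_new = cy + dx * math.sin(theta) + dy * math.cos(theta) + ly
    return x_new, y_new


# The centre of a 5x5 image is unaffected by the rotation and only shifted.
centre_after = transform_coordinates(2, 2, 30.0, 1.0, -1.0, 5, 5)
# With no rotation, a point is simply translated by (lx, ly).
shifted_only = transform_coordinates(4, 0, 0.0, 0.5, 0.25, 5, 5)
```

In practice the perturbed image would be obtained by resampling the original image on the transformed grid; the interpolation scheme is not specified in the text and is left out of the sketch.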

Test 3) Out-of-focus artifact
Experiments on living samples are delicate, especially as regards keeping the culture conditions (oxygen and carbon dioxide concentration, temperature, and relative humidity) as stable as possible. Evaporation of small quantities of liquid and/or the related changes in the concentration of compounds may create effects such as out-of-focus. Similarly, cell death or relevant changes in cell dimension may induce out-of-focus effects. Out-of-focus effects appear as a blurring phenomenon, with decreasing cell details and increasing cell dimension. To simulate out-of-focus effects, we apply a disk-shaped spatial filter with a radius R in the range [Rmin, Rmax] to each image. The Point Spread Function (PSF) of the disk-shaped filter is given by 2

$$h_R(x,y) = \begin{cases} \dfrac{1}{\pi R^2}, & x^2 + y^2 \le R^2 \\ 0, & \text{otherwise} \end{cases} \tag{3}$$

and the application of the filter to the image is achieved by convolution:

$$I_f(x,y) = (I * h_R)(x,y)$$

An example of the out-of-focus artifact is shown in Fig. S3.
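A minimal sketch of the out-of-focus simulation follows: a discrete disk-shaped kernel is built and convolved with the image. The discretization of the disk, its normalization to unit sum, the replication of border pixels, and the function names are assumptions made for this example; the text does not state how the platform handles them.

```python
def disk_psf(radius):
    """Discrete disk kernel: constant inside the disk, zero outside, unit sum."""
    r = int(radius)
    kernel = [[1.0 if dx * dx + dy * dy <= radius * radius else 0.0
               for dx in range(-r, r + 1)]
              for dy in range(-r, r + 1)]
    total = sum(map(sum, kernel))
    return [[value / total for value in row] for row in kernel]


def convolve2d(image, kernel):
    """2D convolution; border pixels are replicated (an assumption)."""
    h, w = len(image), len(image[0])
    r = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky, kernel_row in enumerate(kernel):
                for kx, weight in enumerate(kernel_row):
                    # Clamp indices so that out-of-range pixels reuse the border.
                    yy = min(max(y + r - ky, 0), h - 1)
                    xx = min(max(x + r - kx, 0), w - 1)
                    acc += weight * image[yy][xx]
            out[y][x] = acc
    return out


# A single bright pixel is spread over the disk: details fade, size grows.
impulse = [[0.0] * 5 for _ in range(5)]
impulse[2][2] = 1.0
blurred = convolve2d(impulse, disk_psf(1))
```

Larger radii spread each pixel over a wider disk, which reproduces the loss of detail and the apparent growth of the cell described above.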

IM-ACQ-2
3D phase-contrast TL microscopy is an optical microscopy technique that converts phase shifts in light into brightness changes in the image. When light waves travel through a medium, their interaction causes the wave amplitude and phase to change in a manner dependent on the properties of the medium. Phase-contrast microscopy is particularly important in biology because it reveals many cellular structures that are invisible under a bright-field microscope. The phase-contrast microscope allows biologists to study living cells and how they proliferate through cell division without staining the cells, a step that requires additional preparation and often causes the death of the cells.

Test 1) Brightness variation
Although phase-contrast imaging is much more robust to general brightness variation, such variation may still occur, especially across different experiments performed with different microscopy devices (inter-experiment variation). General brightness variation may also be an operator-dependent artifact, caused by the manual adjustment of image contrast and luminance during the acquisition.
Brightness variation in phase-contrast imaging can be simulated through Eq. 1 by setting lmin = 0 and lmax to a fairly low value since, on average, the background is very dark and the image can only be brightened. An example of the brightness artifact in phase-contrast imaging is shown in Fig. S4.

Test 2) Local out-of-focus
Phase-contrast imaging is usually used to acquire images from a 3D environment in which cells move through space in every direction, so that individual cells may leave the focal plane. To simulate this kind of local out-of-focus, we apply the filter in Eq. 3 with radius R in the range [10, 30] pixels because, in the 3D domain, cells are free to move in a larger space.

An example of the local out-of-focus artifact in phase-contrast imaging is shown in Fig. S5.

Supplementary Figure 5.
Visual example of out-of-focus with filtering radius = 30 pixels (left) and radius = 20 pixels (right). Scale bar is 100 µm.

Test 3) Gel texture variation
In living-cell experiments, cells are cultured in matrices mainly composed of collagen (e.g., type I) 3 , which is the major component of the tumor microenvironment and participates in cancer fibrosis. Collagen biosynthesis can be regulated by cancer cells through mutated genes, transcription factors, signaling pathways, and receptors; furthermore, collagen can influence tumor cell behavior through integrins, discoidin domain receptors, tyrosine kinase receptors, and some signaling pathways. Under the transmission electron microscope, collagen appears to be structured in oriented microfibers. In phase-contrast imaging, collagen fibers are almost invisible because of the practical diffraction limit, close to 0.2 µm; owing to some heterogeneity, however, the collagen texture may be visible in some regions of the field of view. To simulate such an artifact, we generate a striped pattern that is then attenuated by a factor in a given range [min, max] and superimposed on each image. The orientation θ of the pattern is set randomly in the range [θmin, θmax]. An example of an artificial gel pattern image and the related modifications is shown in Fig. S6 A.
An example of the striped patterns and of the collagen pattern artifact in phase-contrast imaging is shown in Fig. S6 B.
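The gel texture perturbation can be sketched as follows. The sinusoidal profile of the stripes, their period, the additive superposition with clipping to [0, 1], and the function names are all assumptions made for illustration: the text specifies only that a striped pattern with random orientation is attenuated and superimposed on the image.

```python
import math


def striped_pattern(height, width, theta_deg, period=8.0):
    """Sinusoidal stripes in [0, 1]; theta is the direction of intensity variation."""
    theta = math.radians(theta_deg)
    return [[0.5 * (1.0 + math.sin(2.0 * math.pi
                                   * (x * math.cos(theta) + y * math.sin(theta))
                                   / period))
             for x in range(width)]
            for y in range(height)]


def overlay_pattern(image, pattern, attenuation):
    """Superimpose the attenuated pattern on the image and clip to [0, 1]."""
    return [[max(0.0, min(pixel + attenuation * stripe, 1.0))
             for pixel, stripe in zip(image_row, pattern_row)]
            for image_row, pattern_row in zip(image, pattern)]


# Dark phase-contrast-like background with a faint texture added on top.
background = [[0.2] * 16 for _ in range(16)]
pattern = striped_pattern(16, 16, theta_deg=0.0)
textured = overlay_pattern(background, pattern, attenuation=0.1)
```

A small attenuation factor keeps the texture faint, in line with the observation that collagen is visible only in some regions of the field of view.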

IM-ACQ-3
Fluorescence microscopy has become an essential tool in biology and the biomedical sciences owing to its ability to extract information at a deeper scale with respect to traditional optical microscopy 4 . The application of an array of fluorochromes has made it possible to identify cells and sub-microscopic cellular components with a high degree of specificity 5 . In fact, the fluorescence microscope is capable of revealing the presence of a single molecule. The observed fluorescence is usually correlated with biological events such as cell death 4 or cell replication. Different artifacts may occur in fluorescence microscopy, such as autofluorescence, photobleaching, and fluorescence saturation.

Test 1) Autofluorescence
Autofluorescence is the natural emission of light by biological structures such as mitochondria, lysosomes, elastin, and collagen 6 when they have absorbed light, and it makes it difficult to distinguish the light originating from artificially added fluorescent markers 7 . Autofluorescence spectra are generally broad, extending over several hundred nanometers. Hence, its interference is often significant at the same emission wavelengths as GFP, leading to low signal-to-noise ratios and to a loss of contrast and clarity in fluorescence microscope images. Autofluorescence is observed in regions with no strong marker emission and may appear when local contrast enhancement approaches are applied. Both non-cellular (i.e., background) and cellular autofluorescence can be found. Due to the large variety and specificity of cellular autofluorescence sources 6 , the DM tool implements a background autofluorescence test. Further versions will consider the case of anisotropic subcellular autofluorescence phenomena.
Therefore, to simulate the presence of autofluorescence, we apply a brightness increase only to the object background in the channel devoted to cell localization (usually the red one). To do this, we first locate the cells stained in red, then extract the local background and apply a brightness increase to it. The brightness increase is achieved by applying the method described in iv) only to the red channel, which is used in the presented case study to stain lung tumor cells.
An example of the autofluorescence artifact image is shown in Fig. S7.
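A minimal sketch of the background autofluorescence test is given below. Locating the cells with a fixed intensity threshold is a deliberate simplification and an assumption of this example, as are the parameter names; the text does not describe the cell localization method used by the platform.

```python
def add_background_autofluorescence(red_channel, l0, cell_threshold=0.5):
    """Brighten only background pixels of the red channel, clipping to [0, 1].

    Pixels at or above cell_threshold are treated as stained cells and left
    unchanged (hypothetical stand-in for the platform's cell localization).
    """
    return [[pixel if pixel >= cell_threshold
             else max(0.0, min(pixel + l0, 1.0))
             for pixel in row]
            for row in red_channel]


# Two bright "cell" pixels on a dark background.
red = [[0.05, 0.9], [0.1, 0.7]]
perturbed = add_background_autofluorescence(red, l0=0.3)
```

Because only the background is brightened, the contrast between cells and background decreases, which is the effect that makes marker emission harder to distinguish.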

Supplementary Figure 7. Visual example of the autofluorescence artifact shown in the red channel with a brightness increase of 0.3 (left) and 0.2 (right). Scale bar is 100 µm.

Test 2) Photobleaching
Photobleaching (also called fading) is the photochemical alteration of a fluorophore molecule that permanently disables its ability to fluoresce, caused by non-specific reactions between the fluorophore and surrounding molecules 8 . In microscopy, photobleaching may complicate the observation of fluorescent molecules, since they will eventually be destroyed by the light exposure necessary to stimulate them into fluorescing. Photobleaching-like fading also occurs when cells replicate, since the division of the cytoplasm dilutes the dye. Although photobleaching can also be used to reduce autofluorescence 6 , thanks to the different decay of the GFP signal and of background autofluorescence, it may strongly affect the final outcome of the experiment. To simulate photobleaching, we decrease the red response (the same holds for the green channel, or for both) of stained lung cancer cells taken from fluorescence time-lapse microscopy images. The background is assumed to remain dark (no autofluorescence mechanism is simultaneously present). Parameter variations are introduced by setting the percentage decrease of the emitted signal with respect to its value at the first frame. We define a so-called bleaching ratio, br, as the ratio between the average red response in the bleached image and that in the image at the first frame. Future extensions of this test will consider, when time-lapse microscopy is available, exponentially decaying bleaching. An example of the photobleaching artifact image is shown in Fig. S8 (bleaching ratio = 0.5 and bleaching ratio = 0.7).
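The bleaching ratio can be illustrated with the sketch below, which assumes that bleaching is simulated by scaling the whole red channel uniformly by br; with a dark background this leaves the background unchanged and makes the ratio of average responses equal to br. The uniform scaling and the function names are assumptions of this example.

```python
def bleach(red_channel, br):
    """Scale the red response by the bleaching ratio br (0 < br <= 1)."""
    return [[br * pixel for pixel in row] for row in red_channel]


def mean_intensity(channel):
    """Average response over all pixels of a channel."""
    values = [pixel for row in channel for pixel in row]
    return sum(values) / len(values)


def bleaching_ratio(bleached, first_frame):
    """Ratio between the average response after bleaching and at the first frame."""
    return mean_intensity(bleached) / mean_intensity(first_frame)


# Two stained cells on a dark (zero) background.
first_frame = [[0.0, 0.8], [0.4, 0.0]]
bleached = bleach(first_frame, br=0.5)
measured_br = bleaching_ratio(bleached, first_frame)
```

An exponentially decaying bleaching over a time-lapse sequence, mentioned above as a future extension, would correspond to a frame-dependent br and is not covered by this sketch.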

Test 3) Fluorescence saturation
Fluorescence saturation effects may impact the spatial resolution of time-lapse microscopy images and induce a loss of details in examining cell parts and functionalities. This can be observed, for example, in cancer cells stained for cell death 4 . To simulate such an artifact, we induce luminance saturation in green-stained cancer cells a certain time before they go into final death. In this way, the evolution of the cell death phenomenon is also biased towards a stronger effect.
Mathematically, denoting by I(x,y) the image in the red channel, we apply a piecewise linear operation, governed by a saturation threshold ths, to obtain the saturated image IS(x,y) 2 .

Supplementary Figure 9. Visual example of the fluorescence saturation artifact shown in the red channel with saturation threshold ths = 0.2 (left) and ths = 0.3 (right). Scale bar is 100 µm.

Supplementary Note 1.2 Features available
The DM platform supports two distinct modalities:
1. handcrafted intensity and texture features;
2. deep features from a Deep Transfer Learning (DTL) algorithm.
Users with programming skills may also add customized functions implementing specific additional features.

Handcrafted features
By default, the platform provides some well-known intensity and texture descriptors that are computed over the original image (or the image subjected to perturbations). The available intensity descriptors are: average intensity, median intensity, standard deviation of the intensity, minimum intensity, 10th, 25th, 75th, and 90th percentiles of the intensity, maximum intensity, and entropy of the intensity 2 . Regarding the texture descriptors, the DM platform includes Haralick features 9 and Histogram of Oriented Gradients (HoG) features 10 . Haralick features are statistics of the so-called gray-level co-occurrence matrix, a rearrangement of the image information that takes into account spatial dependence and local pixel similarity. The HoG features represent the distribution of intensity gradients or edge directions. The image is divided into small connected regions and, for the pixels within each region, a histogram of gradient directions is compiled. The descriptors are the concatenation of these histograms.
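The intensity descriptors listed above can be sketched with the Python standard library as follows. The percentile interpolation rule, the use of the population standard deviation, the 256-bin histogram used for the entropy, and all names are assumptions made for this example; the platform may compute these quantities differently.

```python
import math
import statistics


def percentile(sorted_values, q):
    """Percentile (q in [0, 100]) of pre-sorted values by linear interpolation."""
    pos = (len(sorted_values) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] * (1.0 - frac) + sorted_values[hi] * frac


def entropy(values, bins=256):
    """Shannon entropy (bits) of the histogram of intensities in [0, 1]."""
    counts = [0] * bins
    for value in values:
        counts[min(int(value * bins), bins - 1)] += 1
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts if c)


def intensity_descriptors(image):
    """Return the ten intensity descriptors of an image with values in [0, 1]."""
    values = sorted(pixel for row in image for pixel in row)
    return {
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
        "std": statistics.pstdev(values),  # population std (assumption)
        "min": values[0],
        "p10": percentile(values, 10),
        "p25": percentile(values, 25),
        "p75": percentile(values, 75),
        "p90": percentile(values, 90),
        "max": values[-1],
        "entropy": entropy(values),
    }


features = intensity_descriptors([[0.0, 0.25], [0.75, 1.0]])
```

Computing the same dictionary on the original and on the perturbed image gives the pair of values per descriptor that the sensitivity analysis compares.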

Deep-features
Deep Transfer Learning (DTL) is an approach in deep learning (and machine learning) where knowledge is transferred from one model to another. Transfer learning makes it possible to solve a particular task using all or part of a model already pre-trained on a different task. DTL can treat the pre-trained neural network as a feature extractor by discarding the last fully-connected output layer. This approach allows a lightweight linear classification or regression model (Support Vector Machine, Linear Discriminant Analysis, Support Vector Regression, Multivariate Linear Regression) 11 to be trained on the extracted features, and allows the use of a network already trained for days or weeks on state-of-the-art machines. By selecting different deep layers, the input image is encoded into a different number of descriptors, from detailed representation (higher layers) to coarser encoding (very deep layers). By default, the DM platform includes several well-known deep learning architectures: ResNET101 12 , VGG19 13 , NasNETLarge 13 , and DenseNET201 14 . Each network includes so-called pooling layers, which reduce the data dimensions by combining the outputs of neuron clusters at one layer into a single neuron in the next layer 15 . The result of a pooling layer is a downsampled (pooled) feature map, i.e., a summarized version of the features detected in the input. Pooled feature maps are useful because small changes in the location of a feature detected by the convolutional layer in the input result in a pooled feature map with the feature in the same location. This capability added by pooling is called the model's invariance to local translation.
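The invariance to local translation provided by pooling can be illustrated with a toy example. The sketch below uses non-overlapping 2x2 max pooling on a small feature map; the pooling type, the window size, and the function name are assumptions chosen for illustration and do not describe any specific layer of the networks listed above.

```python
def max_pool(feature_map, size=2):
    """Non-overlapping max pooling with a size x size window."""
    h, w = len(feature_map), len(feature_map[0])
    return [[max(feature_map[y + dy][x + dx]
                 for dy in range(size) for dx in range(size))
             for x in range(0, w - size + 1, size)]
            for y in range(0, h - size + 1, size)]


# The same feature detected at two positions inside one pooling window.
detected = [[0, 0, 0, 0],
            [0, 1, 0, 0],
            [0, 0, 0, 0],
            [0, 0, 0, 0]]
shifted = [[1, 0, 0, 0],
           [0, 0, 0, 0],
           [0, 0, 0, 0],
           [0, 0, 0, 0]]
pooled_detected = max_pool(detected)
pooled_shifted = max_pool(shifted)
```

Both inputs give the same pooled map because the one-pixel shift stays within a single pooling window; a shift that crosses a window boundary would change the pooled map, so the invariance holds only locally.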

Supplementary Note 1.3 Software design and utilization
The Deep-Manager platform has been implemented in the open-source language Python 3.8 within the Anaconda framework. The overall platform architecture has been designed for different levels of expertise. A text file, including a list of parameters and the related ranges of values to be used in the implementation and application of the artifacts, is fed to the DM software. A single text file serves all the tests, so that the user may repeatedly run the platform by modifying one SETTING file only. Advanced users may also modify the tests or add new ones by properly including the corresponding parameters in the SETTING file.
In the following, we list the main steps of the DM functionalities.
STEP1. The user is first asked to select the practical scenario to work on (e.g., 2D TL microscopy, 3D Phase Contrast TL microscopy, or 3D Fluorescence); this selection allows the platform to save the final selection results into a specific file numbered according to the test number. All the tests available for the selected modality are applied.
STEP2. The user is then asked to select the SETTING text file to load the DM configuration. The parameters used are listed in the Algorithm Parameter Section. The file also includes the name of the network used for the transfer learning and the layer used to extract the features, if applicable. Specific details are provided in the previous sections for each test.
STEP3. The user is then asked to select the path where the training dataset of images is stored. Details can be found in the DM Guide https://github.com/BEEuniroma2/Deep-Manager.
STEP4. The user is asked to select the handcrafted or the DTL modality. As a consequence, if the handcrafted selection is chosen, the platform automatically calculates a set of texture and intensity features. If DTL is selected, the platform reads the setting information in the SETTING file mentioned above. DM applies the perturbations according to the tests described above and computes the features before and after the perturbation.
STEP5. The user may visualize the perturbation effects on images selected at random. It is also possible to visualize, in a 2D plot, the values of DP vs. SENSITIVITY for the selected and unselected features. Using the two values obtained for each descriptor fi before and after the perturbations, i.e., fi0 and fimod, the software derives the individual Discriminant Power (DP) value and then computes the Sensitivity (SENS) of descriptor fi to the added perturbation. These definitions rely on AUC(fi), the area under the ROC curve 16 of feature fi in discriminating class1 from class2. The software applies a threshold thDP to classify descriptors according to their DP values and a threshold thSENS to classify them according to their sensitivity values. Descriptors are thus assigned to three regions: high DP and low SENS (selected), with DP higher than thDP and sensitivity lower than thSENS (cyan markers in Fig. S10); high SENS, i.e., rejected because their sensitivity to the artifact is larger than thSENS (blue markers in Fig. S10); and low DP, i.e., rejected because their discriminant power is smaller than thDP (green markers in Fig. S10). The threshold values are loaded from the SETTING text file and may be modified by the user, since they strongly depend on the application. The user should tune the two hyperparameters, thDP and thSENS, so as to select a non-empty and small set of features (usually from 10 to 100). Typical values for the DP and SENS thresholds are thDP in [0.6, 0.7] and thSENS in [0.1, 0.2], respectively.
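The selection logic of STEP5 can be sketched as follows. The formulas used here are assumptions made for illustration and are not confirmed by the text: DP is taken as the AUC of the feature (folded so that it is at least 0.5), and SENS as the relative change of DP after the perturbation. The function names, the default thresholds (picked inside the typical ranges quoted above), and the precedence given to the low-DP label when both rejection criteria apply are likewise hypothetical.

```python
def auc(class1_values, class2_values):
    """Area under the ROC curve via pairwise comparisons (ties count 0.5)."""
    wins = sum(1.0 if b > a else 0.5 if b == a else 0.0
               for a in class1_values for b in class2_values)
    return wins / (len(class1_values) * len(class2_values))


def discriminant_power(class1_values, class2_values):
    """Assumed DP: AUC made independent of which class has larger values."""
    a = auc(class1_values, class2_values)
    return max(a, 1.0 - a)


def sensitivity(dp_original, dp_perturbed):
    """Assumed SENS: relative change of DP caused by the perturbation."""
    return abs(dp_original - dp_perturbed) / dp_original


def classify_feature(dp, sens, th_dp=0.65, th_sens=0.15):
    """Assign a descriptor to one of the three regions described in STEP5."""
    if dp <= th_dp:
        return "low DP"
    if sens >= th_sens:
        return "high SENS"
    return "selected"


# Feature values for class1 and class2, before and after a perturbation.
dp0 = discriminant_power([0.1, 0.2, 0.3], [0.7, 0.8, 0.9])
dp_mod = discriminant_power([0.1, 0.8, 0.3], [0.7, 0.2, 0.9])
label = classify_feature(dp0, sensitivity(dp0, dp_mod))
```

In this toy case the feature separates the classes perfectly before the perturbation but loses a third of its discriminant power afterwards, so it falls in the high-SENS region and would be rejected.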